Project Title: Comparative Performance Analysis of Kolmogorov-Arnold Network and Neural Network in Discovering Dynamics of Linear and Nonlinear Systems.

System identification is the process of characterizing a system by building a mathematical model of its dynamics from input and output data. Oftentimes it is not possible to access the physical system directly, and it is relatively less complex to build a mathematical model that describes the system's responses.

In this project, we identify different linear and nonlinear systems using the Kolmogorov-Arnold Network (KAN) and compare the results with Neural Network based system identification models. Due to limited computational resources, we consider one 3rd-order linear time-invariant system and one 2nd-order nonlinear system.

Kolmogorov-Arnold Networks replace fixed scalar weights with learnable univariate B-spline functions on each edge; the spline coefficients are the trainable parameters.
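
As a small self-contained illustration (a sketch of the idea, not pykan's internals), the code below evaluates a degree-3 clamped B-spline on a uniform grid using SciPy; in a KAN, coefficients like coeffs below play the role of the trainable weights.

import numpy as np
from scipy.interpolate import BSpline

k = 3                                                 # spline degree
grid = np.linspace(-1, 1, 8)                          # uniform grid over the input range
knots = np.r_[[grid[0]] * k, grid, [grid[-1]] * k]    # clamped knot vector
coeffs = np.random.randn(len(knots) - k - 1)          # trainable coefficients (the "weights")

phi = BSpline(knots, coeffs, k)                       # the learnable edge function phi(x)
print(phi(np.linspace(-1, 1, 5)))                     # edge outputs for a few sample inputs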

The whole notebook will cover the following sections:

  01. Data Generation, Visualization and Processing
  02. ML Model Development & Learning
  03. Deriving the Model's Governing Equation
  04. Conclusion

Required Libraries: We will use the following libraries throughout this project:

  • NumPy
  • SciPy
  • PySR (Python symbolic regression)
  • SymPy (symbolic mathematics)
  • PyTorch (torch and torch.optim)
  • scikit-learn (train_test_split)
  • pykan (Kolmogorov-Arnold Network)
  • Matplotlib
  • Plotly
  • Warnings

Let us import the required libraries:

In [1]:
from    models                  import  * 
from    kan                     import  *
from    data_generation         import  *
from    utils                   import  *
from    training                import  *
import  torch.nn.init           as      init
import  matplotlib.pyplot       as      plt
import  plotly.graph_objects    as      go
from    mpl_toolkits.mplot3d    import  Axes3D

# ignore user warnings
import  warnings
warnings.filterwarnings("ignore", category=UserWarning)

#Set default data type
torch.set_default_dtype(torch.float64)
%matplotlib inline

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)
Detected IPython. Loading juliacall extension. See https://juliapy.github.io/PythonCall.jl/stable/compat/#IPython
cuda

Data Generation, Visualization and Processing

Linear System:

We considered the following 3rd-order linear system for system identification:

$$ \dot{x} = -0.1 x - 2 y $$
$$ \dot{y} = 2 x - 0.1 y $$
$$ \dot{z} = -0.3 z $$

Nonlinear System:

We used the predator-prey model for nonlinear system identification in this project:

$$ \text{Prey: } \quad \dot{x} = 1.5 x - x y $$
$$ \text{Predator: } \quad \dot{y} = x y - 3 y $$

Data Generation

First, we defined the differential equations as functions so that we could solve them using SciPy's integrate module and obtain the state values for each system. Then, we defined a lambda function to calculate the derivatives of the states. Finally, we defined a function called create_dataset to generate the dataset for learning purposes.

The create_dataset function takes the following inputs (a minimal sketch is given after the list):

  • System
  • Lambda function associated with the system
  • Initial state
  • Number of samples
  • Time span
  • Function's output mode
  • Device
  • Seed
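
The actual implementation lives in data_generation.py and utils.py. As a rough stand-in, the sketch below (the helper names, time span, and initial condition are assumptions, not the project's exact settings) integrates the linear system with scipy.integrate.solve_ivp, computes the derivative labels, and performs a chronological train/test split:

import numpy as np
import torch
from scipy.integrate import solve_ivp

# Right-hand side of the 3rd-order linear system defined above
def linear_system(t, s):
    x, y, z = s
    return [-0.1 * x - 2 * y, 2 * x - 0.1 * y, -0.3 * z]

def create_dataset(system, x0, n_samples=5000, t_span=(0, 50),
                   train_ratio=0.8, device='cpu'):
    t = np.linspace(*t_span, n_samples)
    sol = solve_ivp(system, t_span, x0, t_eval=t)                     # states x(t)
    states = sol.y.T
    derivs = np.array([system(ti, si) for ti, si in zip(t, states)])  # labels x_dot(t)

    n_train = int(train_ratio * n_samples)                            # 80/20 chronological split
    to_tensor = lambda a: torch.tensor(a, device=device)
    return {'train_input': to_tensor(states[:n_train]),
            'train_label': to_tensor(derivs[:n_train]),
            'test_input':  to_tensor(states[n_train:]),
            'test_label':  to_tensor(derivs[n_train:])}, t

data_linear_sketch, t_sketch = create_dataset(linear_system, x0=[2.0, 0.0, 1.0])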

After generating the data, we split the dataset into train and test sets. We use utils.py and data_generation.py to generate the datasets for both the linear and the nonlinear system.

Data Visualization

Linear System

Dataset Description

In [2]:
print(f"************************\n",
      f"\nLinear Systems Dataset\n",
      f"\n************************\n",
      f"\nTotal Sample Size: {dataset_linear['train_input'].shape[0]}",
      f"\nInput Sample Size: {data_linear['train_input'].shape[0]}",
      f"\nTest Sample Size: {data_linear['test_input'].shape[0]}",
      f"\nInput Features: {data_linear['train_input'].shape[1]}",
      f"\nInput Features: {data_linear['test_label'].shape[1]}") 
************************
 
Linear Systems Dataset
 
************************
 
Total Sample Size: 5000 
Train Sample Size: 4000 
Test Sample Size: 1000 
Input Features: 3 
Output Features: 3

Let us see how the states and the derivatives of the states change with respect to time.

In [3]:
# Create subplots
fig, axs = plt.subplots(1, 2, figsize=(12, 6))
colors = ['r', 'g', 'b', 'm']  

# Define plot configurations
plot_configs = [(0, 'Train Data', 'State'),
                (1, 'Train Labels', 'Output')
                ]

# Plot input and label for each state dimension
for i in range(dataset_linear['train_input'].shape[1]):
    axs[0].plot(t_linear,
                dataset_linear['train_input'][:, i].cpu().numpy(),
                color=colors[i % len(colors)],
                label=f'State {i+1}')
    axs[1].plot(t_linear,
                dataset_linear['train_label'][:, i].cpu().numpy(),
                color=colors[i % len(colors)],
                label=f'Output {i+1}')
# Plot each subplot using the config
for sub_plot, data_class, data_labels in plot_configs:
    axs[sub_plot].set_xlabel('Time Steps', fontsize=10)
    axs[sub_plot].set_ylabel(f'Evolution of {data_labels} for {data_class}', fontsize=10)
    axs[sub_plot].grid(True)
    axs[sub_plot].legend()

plt.tight_layout()
plt.show()
[Figure: evolution of the training states (left) and training labels (right) over time for the linear system]
In [4]:
# Extract data as numpy arrays
x_input = dataset_linear['train_input'][:, 0].cpu().numpy()
y_input = dataset_linear['train_input'][:, 1].cpu().numpy()
z_input = dataset_linear['train_input'][:, 2].cpu().numpy()

x_label = dataset_linear['train_label'][:, 0].cpu().numpy()
y_label = dataset_linear['train_label'][:, 1].cpu().numpy()
z_label = dataset_linear['train_label'][:, 2].cpu().numpy()

# Create traces
trace_input = go.Scatter3d(x=x_input, y=y_input, z=z_input, mode='lines',
                           line=dict(color='purple', width=3), name='Input States')

trace_label = go.Scatter3d(x=x_label, y=y_label, z=z_label, mode='lines',
                           line=dict(color='orange', width=3), name='Rate of Change of States')

# Layout
layout = go.Layout(title="3D Phase Space of the Linear System",
                   scene=dict(xaxis=dict(title='x'),
                              yaxis=dict(title='y'),
                              zaxis=dict(title='z')),
                    legend=dict(x=0.02, y=0.98), margin=dict(l=0, r=0, b=0, t=50))

# Figure
fig = go.Figure(data=[trace_input, trace_label], layout=layout)

# Show interactive plot
fig.show()

Nonlinear System

Dataset Description

In [5]:
print(f"************************\n",
      f"\nLinear Systems Dataset\n",
      f"\n************************\n",
      f"\nTotal Sample Size: {dataset_nonlinear['train_input'].shape[0]}",
      f"\nInput Sample Size: {data_nonlinear['train_input'].shape[0]}",
      f"\nTest Sample Size: {data_nonlinear['test_input'].shape[0]}",
      f"\nInput Features: {data_nonlinear['train_input'].shape[1]}",
      f"\nInput Features: {data_nonlinear['test_label'].shape[1]}") 
************************
 
Nonlinear Systems Dataset
 
************************
 
Total Sample Size: 1000 
Train Sample Size: 800 
Test Sample Size: 200 
Input Features: 2 
Output Features: 2

Plotting the Nonlinear Dataset

In [6]:
# Create subplots
fig, axs = plt.subplots(2, 2, figsize=(12, 8))
colors = ['r', 'g', 'b', 'm']  

# Define plot configurations
plot_configs = [(0, 0, 'State x'),
                (0, 1, 'State y'),
                (1, 0, 'rate of change of x'),
                (1, 1, 'rate of change of y'),
                ]

axs[0, 0].plot(t_nonlinear, dataset_nonlinear['train_input'][:, 0].cpu().numpy(), color = colors[0], label = 'x')
axs[0, 1].plot(t_nonlinear, dataset_nonlinear['train_input'][:, 1].cpu().numpy(), color = colors[1], label = 'y')
axs[1, 0].plot(t_nonlinear, dataset_nonlinear['train_label'][:, 0].cpu().numpy(), color = colors[2], label = r'$\dot{x}$')
axs[1, 1].plot(t_nonlinear, dataset_nonlinear['train_label'][:, 1].cpu().numpy(), color = colors[3], label = r'$\dot{y}$')

# Plot each subplot using the config
for row, col, data_labels in plot_configs:
    axs[row, col].set_xlabel('Time Steps', fontsize=10)
    axs[row, col].set_ylabel(f'Evolution of {data_labels}', fontsize=10)
    axs[row, col].grid(True)
    axs[row, col].legend()

plt.tight_layout()
plt.show()
[Figure: evolution of states x, y (top) and their rates of change (bottom) for the nonlinear system]
In [7]:
# Extract data
x_input_nl = dataset_nonlinear['train_input'][:, 0].cpu().numpy()
y_input_nl = dataset_nonlinear['train_input'][:, 1].cpu().numpy()
z_input_nl = np.zeros_like(x_input_nl)  # embed the 2D trajectory at z = 0 for the 3D plot

x_label_nl = dataset_nonlinear['train_label'][:, 0].cpu().numpy()
y_label_nl = dataset_nonlinear['train_label'][:, 1].cpu().numpy()
z_label_nl = np.zeros_like(x_label_nl)  # embed the 2D derivatives at z = 0 as well

# Create the trace
trace_input_nl = go.Scatter3d(x=x_input_nl, y=y_input_nl, z=z_input_nl, mode='lines',
                              line=dict(color='purple', width=3), name='States')

# Create the trace
trace_label_nl = go.Scatter3d(x=x_label_nl, y=y_label_nl, z=z_label_nl, mode='lines',
                              line=dict(color='green', width=3), name='Rate of Change of States')

# Layout
layout = go.Layout(title="3D Phase Space of the Nonlinear System",
                   scene=dict(xaxis=dict(title='x'), yaxis=dict(title='y'), zaxis=dict(title='z')),
                   margin=dict(l=0, r=0, b=0, t=50), legend=dict(x=0.02, y=0.98))

# Figure
fig = go.Figure(data=[trace_input_nl, trace_label_nl], layout=layout)

# Show interactive plot
fig.show()

ML Model Development & Learning

We defined the Neural Network models in the models.py file. The file includes the following models:

  • Neural Network Model for Linear System
  • Neural Network Model for Nonlinear System

The KAN models are defined in this section.

The training.py file includes a trainer object that we will use to train our models. The following table summarizes the activation functions, optimizers, loss functions, metrics, and learning rate schedulers for all models.

| No | Parameter | Kolmogorov-Arnold Network | Artificial Neural Network |
|----|-----------|---------------------------|---------------------------|
| 01 | Activation Function | Sigmoid Linear Unit (SiLU) for both systems | ReLU for both systems |
| 02 | Optimizer | Limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) for both systems | AdamW for both systems |
| 03 | Loss Function | Root Mean Squared Error for both systems | Mean Squared Error for both systems |
| 04 | Metric | None | MAE for both systems |
| 05 | LR Scheduler | None | ReduceLROnPlateau for both systems |
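
The exact definitions live in models.py. A plausible sketch of the nonlinear model, inferred from the module structure printed later (fc1: 2→16, fc2: 16→2) and the ReLU choice in the table above, could look like this:

import torch
import torch.nn as nn

class NNM_nonlinear(nn.Module):
    def __init__(self, input_size=2, output_size=2, hidden_size=16, seed=1):
        super().__init__()
        torch.manual_seed(seed)                         # reproducible initialization
        self.fc1 = nn.Linear(input_size, hidden_size)   # 2 -> 16
        self.fc2 = nn.Linear(hidden_size, output_size)  # 16 -> 2

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))        # single hidden ReLU layer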

Defining KAN Model for Linear System

The linear KAN model consists of an input and an output layer (width [3, 3]): it maps 3 input features to 3 outputs. The degree of the B-spline polynomial is 3, and the number of grid points is also set to 3.

In [8]:
model_linear = KAN(width=[3,3], grid=3, k=3, seed=1, device=device, auto_save = True)
checkpoint directory created: ./model
saving model version 0.0

We can plot the model's initial diagram after passing the training data through it once.

In [9]:
model_linear(data_linear['train_input'])
model_linear.plot(beta = 10,
                  in_vars=[r'$x_{}$'.format(i+1) for i in range(model_linear.width[0][0])],
                  out_vars = [r'$y_{}$'.format(i+1) for i in range(model_linear.width[-1][0])])
[Figure: initial KAN diagram for the linear system]

Training the Model:

We will train the KAN model for 100 steps.

In [10]:
steps = 100
history_linear = model_linear.fit(data_linear, opt="LBFGS", steps=steps, lamb=0.01)
| train_loss: 7.49e-02 | test_loss: 7.65e-02 | reg: 4.59e+00 | : 100%|█| 100/100 [00:41<00:00,  2.41
saving model version 0.1

Plot of Loss & Regularization Curve

In [11]:
plt.figure(figsize=(10,4))
plt.subplot(1, 2, 1)
plt.plot(history_linear['train_loss'], label = 'Train Loss')
plt.plot(history_linear['test_loss'], label = 'Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Train Vs Test Loss')
plt.grid()
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history_linear['reg'], label = 'Regularization')
plt.xlabel('Epoch')
plt.ylabel('Regularization')
plt.title('Change in Regularization')
plt.grid()
plt.legend()
Out[11]:
<matplotlib.legend.Legend at 0x210a4587770>
[Figure: train vs. test loss (left) and regularization (right) curves]

Model Diagram after Training

In [12]:
model_linear.plot(beta = 10,
                  in_vars=[r'$x_{}$'.format(i+1) for i in range(model_linear.width[0][0])],
                  out_vars = [r'$y_{}$'.format(i+1) for i in range(model_linear.width[-1][0])])
[Figure: KAN diagram after training]

Model's Prediction & Comparison with True Labels

In [13]:
preds_linear = model_linear(dataset_linear['train_input'])

fig, (ax1, ax2, ax3) = plt.subplots(nrows=1, ncols=3, sharex=True, figsize=(12, 4))

ax1.plot(t_linear, dataset_linear['train_label'][:, 0].cpu().numpy(), 'k--', label = 'True')
ax1.plot(t_linear, preds_linear[:, 0].detach().cpu().numpy(), 'r--', label = 'Pred')
ax1.legend()

ax2.plot(t_linear, dataset_linear['train_label'][:, 1].cpu().numpy(), 'k--', label = 'True')
ax2.plot(t_linear, preds_linear[:, 1].detach().cpu().numpy(), 'r--', label = 'Pred')
ax2.legend()

ax3.plot(t_linear, dataset_linear['train_label'][:, 2].cpu().numpy(), 'k--', label = 'True')
ax3.plot(t_linear, preds_linear[:, 2].detach().cpu().numpy(), 'r--', label = 'Pred')
ax3.legend()
Out[13]:
<matplotlib.legend.Legend at 0x210a222ffb0>
[Figure: KAN predictions vs. true labels for the three linear states]

From the loss curve and model plot, we see that the model is not doing well with a B-spline degree of 3 and only 3 grid points. Let us initialize a finer-grained KAN with 10 grid points and train the model again.

In [14]:
model_linear = model_linear.refine(10)
result = model_linear.fit(data_linear, opt="LBFGS", steps=100, lamb=0.01)
saving model version 0.2
| train_loss: 6.36e-02 | test_loss: 6.60e-02 | reg: 4.57e+00 | : 100%|█| 100/100 [00:36<00:00,  2.75
saving model version 0.3

Since the loss has decreased, we can now refine the grids iteratively.

In [15]:
grids = np.array([3,10,20,50,100])
train_losses_linear = []
test_losses_linear = []
steps = 200
k = 3

for i in range(grids.shape[0]):
    if i == 0:
        model_linear = KAN(width=[3,3], grid=grids[i], k=k, seed=1, device=device)
    if i != 0:
        model_linear = model_linear.refine(grids[i])
    results = model_linear.fit(data_linear, opt="LBFGS", steps=steps)
    train_losses_linear += results['train_loss']
    test_losses_linear += results['test_loss']
checkpoint directory created: ./model
saving model version 0.0
| train_loss: 2.13e-04 | test_loss: 2.19e-04 | reg: 5.98e+00 | : 100%|█| 200/200 [01:08<00:00,  2.93
saving model version 0.1
saving model version 0.2
| train_loss: 1.41e-05 | test_loss: 1.56e-05 | reg: 5.96e+00 | : 100%|█| 200/200 [01:04<00:00,  3.10
saving model version 0.3
saving model version 0.4
| train_loss: 1.82e-06 | test_loss: 1.97e-06 | reg: 5.95e+00 | : 100%|█| 200/200 [01:16<00:00,  2.63
saving model version 0.5
saving model version 0.6
| train_loss: 3.47e-07 | test_loss: 3.66e-07 | reg: 5.95e+00 | : 100%|█| 200/200 [01:14<00:00,  2.70
saving model version 0.7
saving model version 0.8
| train_loss: 9.50e-08 | test_loss: 1.12e-07 | reg: 5.95e+00 | : 100%|█| 200/200 [01:28<00:00,  2.27
saving model version 0.9

RMSE Vs Steps in Grid Iteration:

Now, let us look at the training dynamics of the loss. The loss exhibits a staircase pattern, dropping suddenly after each grid refinement.
In [16]:
plt.plot(train_losses_linear)
plt.plot(test_losses_linear)
plt.legend(['train', 'test'])
plt.ylabel('RMSE')
plt.title("Linear System: Training Dynamics of Loss")
plt.xlabel('step')
plt.yscale('log')
[Figure: RMSE vs. training step across grid refinements, log scale]

RMSE Vs Number of Parameters in KAN

In [17]:
n_params_linear = 9 * (grids + 3)  # (number of spline edges) * (grid + k); width [3,3] has 3*3 = 9 edges
train_vs_G_linear = train_losses_linear[(steps-1)::steps]
test_vs_G_linear = test_losses_linear[(steps-1)::steps]
plt.plot(n_params_linear, train_vs_G_linear, marker="o")
plt.plot(n_params_linear, test_vs_G_linear, marker="o")
plt.plot(n_params_linear, 100*n_params_linear**(-4.), ls="--", color="black")
plt.xscale('log')
plt.yscale('log')
plt.legend(['train', 'test', r'$N^{-4}$'])
plt.xlabel('number of params')
plt.ylabel('RMSE')
plt.title("Linear Systems: Model's Parameter Vs RMSE")
Out[17]:
Text(0.5, 1.0, "Linear Systems: Model's Parameter Vs RMSE")
[Figure: RMSE vs. number of parameters with $N^{-4}$ reference line]

Model's Final Diagram:

The model looks like the following after grid refinement:
In [18]:
model_linear.plot(beta = 10,
                  in_vars=[r'$x_{}$'.format(i+1) for i in range(model_linear.width[0][0])],
                  out_vars = [r'$y_{}$'.format(i+1) for i in range(model_linear.width[-1][0])])
[Figure: final KAN diagram for the linear system after grid refinement]

Final Model's Prediction and Comparison with True Labels

In [19]:
preds_linear = model_linear(dataset_linear['train_input'])

fig, (ax1, ax2, ax3) = plt.subplots(nrows=1, ncols=3, sharex=True, figsize=(12, 4))

ax1.plot(t_linear, dataset_linear['train_label'][:, 0].cpu().numpy(), 'k--', label = 'True')
ax1.plot(t_linear, preds_linear[:, 0].detach().cpu().numpy(), 'r--', label = 'Pred')
ax1.legend()

ax2.plot(t_linear, dataset_linear['train_label'][:, 1].cpu().numpy(), 'k--', label = 'True')
ax2.plot(t_linear, preds_linear[:, 1].detach().cpu().numpy(), 'r--', label = 'Pred')
ax2.legend()

ax3.plot(t_linear, dataset_linear['train_label'][:, 2].cpu().numpy(), 'k--', label = 'True')
ax3.plot(t_linear, preds_linear[:, 2].detach().cpu().numpy(), 'r--', label = 'Pred')
ax3.legend()
Out[19]:
<matplotlib.legend.Legend at 0x210b3204bc0>
[Figure: refined KAN predictions vs. true labels for the three linear states]

We will follow the same steps for learning nonlinear dynamics.

Learning Nonlinear Systems with KAN

The nonlinear KAN model has 3 layers: input, hidden, and output. It takes 2 input features and produces 2 features in the output layer; the hidden layer has 4 nodes. The degree of the B-spline polynomial is set to 3, and the number of grid points is 3 for this model.

Defining Model

In [20]:
model_nonlinear = KAN(width=[2, 4, 2], grid=3, k=3, seed=1, device=device, auto_save = True)
checkpoint directory created: ./model
saving model version 0.0

Initial Model Plot

In [21]:
model_nonlinear(data_nonlinear['train_input'])
model_nonlinear.plot(beta = 10,
                  in_vars=[r'$x_{}$'.format(i+1) for i in range(model_nonlinear.width[0][0])],
                  out_vars = [r'$y_{}$'.format(i+1) for i in range(model_nonlinear.width[-1][0])])
[Figure: initial KAN diagram for the nonlinear system]

Training the Model

In [22]:
history_nonlinear = model_nonlinear.fit(data_nonlinear, opt="LBFGS", steps=100, lamb=0.01)
| train_loss: 4.99e-02 | test_loss: 5.04e-02 | reg: 1.72e+01 | : 100%|█| 100/100 [01:00<00:00,  1.65
saving model version 0.1
In [23]:
plt.figure(figsize=(10,4))
plt.subplot(1, 2, 1)
plt.plot(history_nonlinear['train_loss'], label = 'Train Loss')
plt.plot(history_nonlinear['test_loss'], label = 'Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Train Vs Test Loss')
plt.grid()
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history_nonlinear['reg'], label = 'Regularization')
plt.xlabel('Epoch')
plt.ylabel('Regularization')
plt.title('Change in Regularization')
plt.grid()
plt.legend()
Out[23]:
<matplotlib.legend.Legend at 0x2109d1da8d0>
[Figure: train vs. test loss (left) and regularization (right) curves]

Model Diagram after Training

In [24]:
model_nonlinear.plot(beta = 10,
                  in_vars=[r'$x_{}$'.format(i+1) for i in range(model_nonlinear.width[0][0])],
                  out_vars = [r'$y_{}$'.format(i+1) for i in range(model_nonlinear.width[-1][0])])
[Figure: KAN diagram after training]

Let's initialize a finer-grained KAN with G = 10 to see whether the loss goes down or not.

In [25]:
model_nonlinear = model_nonlinear.refine(10)
result = model_nonlinear.fit(data_nonlinear, opt="LBFGS", steps=100, lamb=0.01)
saving model version 0.2
| train_loss: 1.84e-02 | test_loss: 1.87e-02 | reg: 1.69e+01 | : 100%|█| 100/100 [01:08<00:00,  1.46
saving model version 0.3

Since the loss goes down, let us train finer-grained KANs iteratively.

In [26]:
grids = np.array([3,10,20,50,100])
train_losses_nonlinear = []
test_losses_nonlinear = []
steps = 250
k = 3

for i in range(grids.shape[0]):
    if i == 0:
        model_nonlinear = KAN(width=[2,4,2], grid=grids[i], k=k, seed=1, device=device)
    if i != 0:
        model_nonlinear = model_nonlinear.refine(grids[i])
    results = model_nonlinear.fit(data_nonlinear, opt="LBFGS", steps=steps)
    train_losses_nonlinear += results['train_loss']
    test_losses_nonlinear += results['test_loss']
checkpoint directory created: ./model
saving model version 0.0
| train_loss: 1.63e-03 | test_loss: 1.66e-03 | reg: 2.50e+01 | : 100%|█| 250/250 [01:54<00:00,  2.19
saving model version 0.1
saving model version 0.2
| train_loss: 9.99e-05 | test_loss: 9.43e-05 | reg: 2.50e+01 | : 100%|█| 250/250 [01:54<00:00,  2.18
saving model version 0.3
saving model version 0.4
| train_loss: 1.18e-05 | test_loss: 1.38e-05 | reg: 2.50e+01 | : 100%|█| 250/250 [01:55<00:00,  2.17
saving model version 0.5
saving model version 0.6
| train_loss: 2.29e-06 | test_loss: 2.87e-06 | reg: 2.50e+01 | : 100%|█| 250/250 [01:49<00:00,  2.28
saving model version 0.7
saving model version 0.8
| train_loss: 1.25e-06 | test_loss: 3.43e-06 | reg: 2.50e+01 | : 100%|█| 250/250 [02:04<00:00,  2.01
saving model version 0.9

RMSE Vs Steps in Grid Iteration

In [27]:
plt.plot(train_losses_nonlinear)
plt.plot(test_losses_nonlinear)
plt.legend(['train', 'test'])
plt.ylabel('RMSE')
plt.title("Nonlinear System: Training Dynamics of Loss")
plt.xlabel('step')
plt.yscale('log')
[Figure: RMSE vs. training step across grid refinements, log scale]

RMSE Vs Number of Parameters in KAN

In [28]:
n_params_nonlinear = 16 * (grids + 3)  # (number of spline edges) * (grid + k); width [2,4,2] has 2*4 + 4*2 = 16 edges
train_vs_G_nonlinear = train_losses_nonlinear[(steps-1)::steps]
test_vs_G_nonlinear = test_losses_nonlinear[(steps-1)::steps]
plt.plot(n_params_nonlinear, train_vs_G_nonlinear, marker="o")
plt.plot(n_params_nonlinear, test_vs_G_nonlinear, marker="o")
plt.plot(n_params_nonlinear, 100*n_params_nonlinear**(-4.), ls="--", color="black")
plt.xscale('log')
plt.yscale('log')
plt.legend(['train', 'test', r'$N^{-4}$'])
plt.xlabel('number of params')
plt.ylabel('RMSE')
plt.title("Nonlinear Systems: Model's Parameter Vs RMSE")
Out[28]:
Text(0.5, 1.0, "Nonlinear Systems: Model's Parameter Vs RMSE")
[Figure: RMSE vs. number of parameters with $N^{-4}$ reference line]

Model's Final Diagram

In [29]:
model_nonlinear.plot(beta = 10,
                  in_vars=[r'$x_{}$'.format(i+1) for i in range(model_nonlinear.width[0][0])],
                  out_vars = [r'$y_{}$'.format(i+1) for i in range(model_nonlinear.width[-1][0])])
[Figure: final KAN diagram for the nonlinear system after grid refinement]

Final Model's Prediction and Comparison with True Labels

In [30]:
preds_nonlinear = model_nonlinear(dataset_nonlinear['train_input'])

fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, sharex=True, figsize=(12, 4))

ax1.plot(t_nonlinear, dataset_nonlinear['train_label'][:, 0].cpu().numpy(), 'k--', label = 'True')
ax1.plot(t_nonlinear, preds_nonlinear[:, 0].detach().cpu().numpy(), 'r--', label = 'Pred')
ax1.legend()

ax2.plot(t_nonlinear, dataset_nonlinear['train_label'][:, 1].cpu().numpy(), 'k--', label = 'True')
ax2.plot(t_nonlinear, preds_nonlinear[:, 1].detach().cpu().numpy(), 'r--', label = 'Pred')
ax2.legend()
Out[30]:
<matplotlib.legend.Legend at 0x2112993a390>
[Figure: refined KAN predictions vs. true labels for the two nonlinear states]

Learning Linear Systems with Neural Network

In [31]:
NN_linear = NNM_liniear(input_size = 3, output_size = 3, hidden_size = 32, num_layers = 2)
NN_linear.to(device)
Out[31]:
NNM_liniear(
  (fc): Linear(in_features=3, out_features=3, bias=True)
)
In [32]:
Metrics = metrics()
optimizer_linear = optim.AdamW(NN_linear.parameters(), lr=1e-3)
criterion = torch.nn.MSELoss()
# Define ReduceLROnPlateau scheduler
scheduler_linear = ReduceLROnPlateau(optimizer_linear,
                                        mode='min',
                                        factor=0.7,
                                        patience=5)
In [33]:
NN_Trainer_linear = trainer(model = NN_linear,
                  trainloader = trainloader_linear, 
                  valloader = valloader_linear,
                  num_epochs = 280,
                  optimizer = optimizer_linear,
                  loss_func = criterion,
                  metrics = Metrics,
                  device = device,
                  scheduler  = scheduler_linear
                  )

NNM_history_linear = NN_Trainer_linear.fit()
100%|██████████| 280/280 [01:13<00:00,  3.80it/s, Train_MAE=6.06e-7, Train_loss=4.67e-16, Val_MAE=1.17e-6, Val_loss=4.41e-16]  
Train Loss 0.0000, Val Loss: 0.0000, Train MAE: 0.0000, Val MAE: 0.0000, Train Adjusted R²: 1.0000, Val Adjusted R²: 1.0000
In [34]:
# Plot training and validation loss and accuracy
plt.figure(figsize=(10,4))
plt.subplot(1, 2, 1)
plt.plot(NNM_history_linear['Average Train Loss'], label = 'Train Loss')
plt.plot(NNM_history_linear['Average Val Loss'], label = 'Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss Vs Validation Loss')
plt.grid()
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(NNM_history_linear['Average Train MAE'], label = 'Average Train MAE')
plt.plot(NNM_history_linear['Average Val MAE'], label = 'Average Validation MAE')
plt.xlabel('Epoch')
plt.ylabel('MAE')
plt.title('Training Vs Validation MAE Comparison')
plt.grid()
plt.legend()
Out[34]:
<matplotlib.legend.Legend at 0x21105a228d0>
[Figure: NN train/validation loss (left) and MAE (right) curves]

Plotting the Neural Network's Prediction for Linear System

In [35]:
NNM_preds_linear = NN_linear(dataset_linear['train_input'])
In [36]:
fig, (ax1, ax2, ax3) = plt.subplots(nrows=1, ncols=3, sharex=True, figsize=(12, 4))
ax1.plot(t_linear, dataset_linear['train_label'][:, 0].cpu().numpy(), 'k--', label = 'True')
ax1.plot(t_linear, NNM_preds_linear[:, 0].detach().cpu().numpy(), 'r--', label = 'Pred')
ax1.legend()

ax2.plot(t_linear, dataset_linear['train_label'][:, 1].cpu().numpy(), 'k--', label = 'True')
ax2.plot(t_linear, NNM_preds_linear[:, 1].detach().cpu().numpy(), 'r--', label = 'Pred')
ax2.legend()

ax3.plot(t_linear, dataset_linear['train_label'][:, 2].cpu().numpy(), 'k--', label = 'True')
ax3.plot(t_linear, NNM_preds_linear[:, 2].detach().cpu().numpy(), 'r--', label = 'Pred')
ax3.legend()
Out[36]:
<matplotlib.legend.Legend at 0x2110a7af9e0>
[Figure: NN predictions vs. true labels for the three linear states]

Learning Nonlinear Systems with Neural Network

In [37]:
NN_nonlinear = NNM_nonlinear(input_size=2, output_size=2, hidden_size=16, seed=1)
NN_nonlinear.to(device)
Out[37]:
NNM_nonlinear(
  (fc1): Linear(in_features=2, out_features=16, bias=True)
  (fc2): Linear(in_features=16, out_features=2, bias=True)
)
In [38]:
optimizer_nonlinear = optim.AdamW(NN_nonlinear.parameters(), lr=1e-3)

# Define ReduceLROnPlateau scheduler
scheduler_nonlinear = ReduceLROnPlateau(optimizer_nonlinear,
                                        mode='min',
                                        factor=0.7,
                                        patience=4)
In [39]:
NN_Trainer_nonlinear = trainer(model = NN_nonlinear,
                               trainloader = trainloader_nonlinear,
                               valloader = valloader_nonlinear,
                               num_epochs = 500,
                               optimizer = optimizer_nonlinear,
                               loss_func = criterion,
                               metrics = Metrics,
                               device = device,
                               scheduler  = scheduler_nonlinear
                               )
NNM_history_nonlinear = NN_Trainer_nonlinear.fit()
100%|██████████| 500/500 [00:32<00:00, 15.34it/s, Train_MAE=0.149, Train_loss=4.15, Val_MAE=0.148, Val_loss=3.83]
Train Loss 4.1487, Val Loss: 3.8266, Train MAE: 0.1495, Val MAE: 0.1478, Train Adjusted R²: 0.7373, Val Adjusted R²: 0.7434
In [40]:
NNM_history_linear.keys()
Out[40]:
dict_keys(['Average Train Loss', 'Average Train MAE', 'Average Train R2', 'Average Train Adjusted R2', 'Average Val Loss', 'Average Val MAE', 'Average Val R2', 'Average Val Adjusted R2'])
In [41]:
# Plot training and validation loss and accuracy
plt.figure(figsize=(10,4))
plt.subplot(1, 2, 1)
plt.plot(NNM_history_nonlinear['Average Train Loss'], label = 'Train Loss')
plt.plot(NNM_history_nonlinear['Average Val Loss'], label = 'Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss Vs Validation Loss')
plt.grid()
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(NNM_history_nonlinear['Average Train Adjusted R2'], label = 'Average Train Adjusted R Squared')
plt.plot(NNM_history_nonlinear['Average Val Adjusted R2'], label = 'Average Validation Adjusted R Squared')
plt.xlabel('Epoch')
plt.ylabel('Adjusted R²')
plt.title('Training Vs Validation Adjusted R Squared Comparison')
plt.grid()
plt.legend()
Out[41]:
<matplotlib.legend.Legend at 0x211058ea8d0>
[Figure: NN train/validation loss (left) and adjusted R² (right) curves]

Plotting the Neural Network's Prediction for Nonlinear System

In [42]:
NNM_preds_nonlinear = NN_nonlinear(dataset_nonlinear['train_input'])
NNM_preds_nonlinear.shape, t_nonlinear.shape
Out[42]:
(torch.Size([1000, 2]), (1000,))
In [43]:
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, sharex=True, figsize=(12, 4))

ax1.plot(t_nonlinear, dataset_nonlinear['train_label'][:, 0].cpu().numpy(), 'k--', label = 'True')
ax1.plot(t_nonlinear, NNM_preds_nonlinear[:, 0].detach().cpu().numpy(), 'r--', label = 'Pred')
ax1.legend()

ax2.plot(t_nonlinear, dataset_nonlinear['train_label'][:, 1].cpu().numpy(), 'k--', label = 'True')
ax2.plot(t_nonlinear, NNM_preds_nonlinear[:, 1].detach().cpu().numpy(), 'r--', label = 'Pred')
ax2.legend()
Out[43]:
<matplotlib.legend.Legend at 0x2109d1e5f10>
[Figure: NN predictions vs. true labels for the two nonlinear states]

Deriving the Model's Governing Equation

The symbolic KAN layer provided by the pykan library supports only univariate functions. However, the predator-prey model includes a bivariate nonlinear term (xy). As a result, the default symbolic KAN layer cannot produce the differential equations of the predator-prey system. To address this limitation, we use Python Symbolic Regression (PySR) to discover the governing equations represented by the trained models.

Python Symbolic Regression

To derive the governing equations using PySR, we follow these steps:

  • Define Python symbolic mappings
  • Define the symbolic regressor
  • Define the inputs and outputs of the symbolic regressor, and
  • Fit the regressor to produce the model's governing equations.

Our Python symbolic mappings and the symbolic regressor are defined in the models.py file. A minimal sketch of the underlying PySR workflow is shown below.
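
The parameter choices below are illustrative rather than the exact settings in models.py; the idea is to fit a symbolic regressor on the trained model's inputs and predicted derivatives and read off SymPy expressions, one per output dimension.

import numpy as np
from pysr import PySRRegressor
from sympy import simplify

# Inputs: the system states; targets: the trained KAN's predicted derivatives
X = dataset_linear['train_input'].detach().cpu().numpy()
Y = model_linear(dataset_linear['train_input']).detach().cpu().numpy()

regressor = PySRRegressor(
    niterations=40,                     # search budget (illustrative)
    binary_operators=["+", "-", "*"],   # a small operator set keeps equations simple
    model_selection="best",             # best accuracy/complexity trade-off
)
regressor.fit(X, Y, variable_names=["x", "y", "z"])

# One SymPy expression per output dimension (x_dot, y_dot, z_dot)
for eq in regressor.sympy():
    print(simplify(eq))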

Governing Equations of KAN Model for Linear System

In [44]:
X_linear = dataset_linear['train_input']
KAN_equations_Linear = Equation_Generator(model_linear, dataset_linear['train_input'],  variable_names=["x", "y", "z"])
🔍 Fitting Model for it's Governing Equtions...

✅ Completed Symbolic Regression for the Given Model.

In [45]:
simplify(KAN_equations_Linear[0].equation)
Out[45]:
$\displaystyle - 0.1 x - 2.0 y$
In [46]:
simplify(KAN_equations_Linear[1].equation)
Out[46]:
$\displaystyle 2 x - 0.1 y$
In [47]:
simplify(KAN_equations_Linear[2].equation)
Out[47]:
$\displaystyle - 0.3 z$

Governing Equations of KAN Model for Nonlinear System

In [48]:
KAN_equations_nonlinear = Equation_Generator(model_nonlinear, dataset_nonlinear['train_input'], variable_names=["x", "y"])
🔍 Fitting Model for it's Governing Equtions...

✅ Completed Symbolic Regression for the Given Model.

In [49]:
simplify(KAN_equations_nonlinear[0].equation)
Out[49]:
$\displaystyle - x \left(y - 1.5\right)$
In [50]:
simplify(KAN_equations_nonlinear[1].equation)
Out[50]:
$\displaystyle y \left(x - 3.0\right)$

Governing Equations of Neural Network Model for Linear System

In [51]:
NN_equations_linear = Equation_Generator(NN_linear, dataset_linear['train_input'], variable_names=["x", "y", "z"])
🔍 Fitting Model for it's Governing Equtions...

✅ Completed Symbolic Regression for the Given Model.

In [52]:
simplify(NN_equations_linear[0].equation)
Out[52]:
$\displaystyle - 0.10000002 x - 2 y$
In [53]:
simplify(NN_equations_linear[1].equation)
Out[53]:
$\displaystyle 2 x - 0.09999999 y$
In [54]:
simplify(NN_equations_linear[2].equation)
Out[54]:
$\displaystyle - 0.3 z$

Governing Equations of Neural Network Model for Nonlinear System

In [55]:
NN_equations_nonlinear = Equation_Generator(NN_nonlinear, dataset_nonlinear['train_input'],  variable_names=["x", "y"])
🔍 Fitting Model for it's Governing Equtions...

✅ Completed Symbolic Regression for the Given Model.

In [56]:
simplify(NN_equations_nonlinear[0].equation)
Out[56]:
$\displaystyle 0.0216451537156584 x - 2.9616516 y + 4.36693188715236$
In [57]:
simplify(NN_equations_nonlinear[1].equation)
Out[57]:
$\displaystyle 1.481901 x - 0.030257374 y - 4.3944111462078$

Conclusion

Using Kolmogorov-Arnold Networks, we recovered the governing differential equations of both the linear and the nonlinear system. In contrast, the Neural Network recovered only the linear system; it failed to accurately approximate the nonlinear one. Below is a comparison table for system identification:

| Original System Dynamics | Derived Using KAN | Derived Using Neural Network |
|--------------------------|-------------------|------------------------------|
| $\dot{x} = -0.1 x - 2 y \\ \dot{y} = 2 x - 0.1 y \\ \dot{z} = -0.3 z$ | $\dot{x} = -0.1x - 2.0y \\ \dot{y} = 2.0x - 0.1y \\ \dot{z} = -0.3z$ | $\dot{x} = -0.1x - 2y \text{ (approx.)} \\ \dot{y} = 2.0x - 0.1y \text{ (approx.)} \\ \dot{z} = -0.3z \text{ (approx.)}$ |
| $\dot{x} = 1.5 x - x y \\ \dot{y} = x y - 3.0 y$ | $\dot{x} = 1.5x - xy \\ \dot{y} = xy - 3.0y$ | $\dot{x} = 0.0216x - 2.9617y + 4.3669 \text{ (approx.)} \\ \dot{y} = 1.4819x - 0.0303y - 4.3944 \text{ (approx.)}$ |

In this project, the Kolmogorov-Arnold Network proved more interpretable than the Neural Network and more effective at system identification. With more computational power, it should be able to perform system identification of higher-order dynamical systems.